 unit type





Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects

Wang, Han, Yu, Sijia, Chen, Chunyang, Turhan, Burak, Zhu, Xiaodong

arXiv.org Artificial Intelligence

Deep Learning (DL) models have rapidly advanced, focusing on achieving high performance through testing model accuracy and robustness. However, it is unclear whether DL projects, as software systems, are thoroughly tested or functionally correct, even though they need to be treated and tested like other software systems. Therefore, we empirically study the unit tests in open-source DL projects, analyzing 9,129 projects from GitHub. We find that: 1) unit-tested DL projects correlate positively with open-source project metrics and have a higher pull request acceptance rate, 2) 68% of the sampled DL projects are not unit tested at all, and 3) the layers and utilities (utils) of DL models have the most unit tests. Based on these findings and previous research outcomes, we built a mapping taxonomy between unit tests and faults in DL projects. We discuss the implications of our findings for developers and researchers and highlight the need for unit testing in open-source DL projects to ensure their reliability and stability. The study contributes to this community by raising awareness of the importance of unit testing in DL projects and encouraging further research in this area.


SMAClite: A Lightweight Environment for Multi-Agent Reinforcement Learning

Michalski, Adam, Christianos, Filippos, Albrecht, Stefano V.

arXiv.org Artificial Intelligence

There is a lack of standard benchmarks for Multi-Agent Reinforcement Learning (MARL) algorithms. The StarCraft Multi-Agent Challenge (SMAC) has been widely used in MARL research, but is built on top of a heavy, closed-source computer game, StarCraft II. Thus, SMAC is computationally expensive, and any meaningful alteration or contribution to the environment requires knowledge and use of proprietary tools specific to the game. We introduce SMAClite, a challenge based on SMAC that is both decoupled from StarCraft II and open-source, along with a framework that makes it possible to create new content for SMAClite without any special knowledge. We conduct experiments to show that SMAClite is equivalent to SMAC, by training MARL algorithms on SMAClite and reproducing SMAC results. We then show that SMAClite outperforms SMAC in both runtime speed and memory.


A Framework for Understanding and Visualizing Strategies of RL Agents

Sequeira, Pedro, Elenius, Daniel, Hostetler, Jesse, Gervasio, Melinda

arXiv.org Artificial Intelligence

Recent years have seen significant advances in explainable AI as the need to understand deep learning models has gained importance with the increased emphasis on trust and ethics in AI. Comprehensible models for sequential decision tasks are a particular challenge as they require understanding not only individual predictions but a series of predictions that interact with environmental dynamics. We present a framework for learning comprehensible models of sequential decision tasks in which agent strategies are characterized using temporal logic formulas. Given a set of agent traces, we first cluster the traces using a novel embedding method that captures frequent action patterns. We then search for logical formulas that explain the agent strategies in the different clusters. We evaluate our framework on combat scenarios in StarCraft II (SC2), using traces from a handcrafted expert policy and a trained reinforcement learning agent. We implemented a feature extractor for SC2 environments that extracts traces as sequences of high-level features describing both the state of the environment and the agent's local behavior from agent replays. We further designed a visualization tool depicting the movement of units in the environment that helps understand how different task conditions lead to distinct agent behavior patterns in each trace cluster. Experimental results show that our framework is capable of separating agent traces into distinct groups of behaviors for which our approach to strategy inference produces consistent, meaningful, and easily understood strategy descriptions.


DSC Weekly Digest 16 Nov 2021: The Importance of Dimensional Modeling - DataScienceCentral.com

#artificialintelligence

When I was in high school, I had a superb chemistry teacher, something I, unfortunately, failed to appreciate until long after I went to college. For the first year of AP chemistry, we spent a huge amount of time working on what was at the time called unit analysis, though from a modeling perspective this is now known as dimensional analysis. It is, sadly, something of a lost art, and it's something that trips up people far more often than it should. Dimensional analysis, in its purest form, can be summarized as the statement "You can't compare apples to oranges." Put another way, if you add three apples to two oranges, you do not have five apples.
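The apples-and-oranges rule can be made concrete in code. The following is a minimal sketch, not taken from the article: the `Quantity` class and its unit labels are hypothetical names chosen for illustration. It shows a value that carries its unit and refuses to be added to a value with a different unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit label (hypothetical helper for illustration)."""
    value: float
    unit: str

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Addition is only meaningful when both operands share the same unit.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.value + other.value, self.unit)


apples = Quantity(3, "apple")
more_apples = Quantity(2, "apple")
oranges = Quantity(2, "orange")

# Same unit: the sum is well defined.
total = apples + more_apples

# Different units: the addition is rejected instead of silently yielding "five apples".
try:
    apples + oranges
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Raising an error at the point of addition is one possible design choice; a fuller treatment would also track how units combine under multiplication and division.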


Harnessing the Power of Artificial Intelligence for Self-Storage Revenue Management

#artificialintelligence

Artificial intelligence (AI) is ubiquitous and set to be a significant driver of the world's economic activity in the next decade. It's a constellation of many technologies working in tandem to enable machines to sense, comprehend, act and learn with human-like levels of intelligence. Tools like machine learning (e.g., your credit card company sends a text about potentially fraudulent activity) and natural language processing (e.g., your phone helps you with the next likely word in a sentence) are part of the AI landscape. They'll continue to affect everything we do as we collect more data and enhance algorithms for better decision-making. As in other industries, AI will transform every layer of self-storage operation, too, including customer service, tenant access, security, finance, sales, marketing and revenue management (RM).


TotalBotWar: A New Pseudo Real-time Multi-action Game Challenge and Competition for AI

Estaben, Alejandro, Díaz, César, Montoliu, Raul, Pérez-Liebana, Diego

arXiv.org Artificial Intelligence

This paper presents TotalBotWar, a new pseudo real-time multi-action challenge for game AI, as well as some initial experiments that benchmark the framework with different agents. The game is based on the real-time battles of the popular Total War game series, where players manage an army to defeat the opponent's army. In the proposed game, a turn consists of a set of orders to control the units. The number of orders and the specific orders that can be performed in a turn vary during the progression of the game. One interesting feature of the game is that if a particular unit does not receive an order in a turn, it will continue performing the action specified in a previous turn. The turn-wise branching factor becomes overwhelming for traditional algorithms, and the partial observability of the game state makes the proposed game an interesting platform to test modern AI algorithms.
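The order-persistence rule described in this abstract can be sketched in a few lines. This is an illustrative assumption about how such a rule might be modeled, not the TotalBotWar implementation: the function `resolve_orders`, the unit names, and the action strings are all hypothetical.

```python
def resolve_orders(previous, new_orders):
    """Return each unit's action for this turn.

    Units that received a new order use it; units that did not
    keep the action they were already performing.
    """
    return {unit: new_orders.get(unit, action) for unit, action in previous.items()}


# Actions carried over from the previous turn (hypothetical units and actions).
previous = {"archers": "hold", "cavalry": "charge", "infantry": "advance"}

# Only the cavalry receives an order this turn.
new_orders = {"cavalry": "retreat"}

current = resolve_orders(previous, new_orders)
```

Because a turn may order any subset of units, the number of distinct turns grows with every combination of units and orders, which is one way to see why the abstract describes the turn-wise branching factor as overwhelming.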


Hierarchical Decision Making by Generating and Following Natural Language Instructions

Hu, Hengyuan, Yarats, Denis, Gong, Qucheng, Tian, Yuandong, Lewis, Mike

arXiv.org Artificial Intelligence

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making. Rather than directly selecting micro-actions, our agent first generates a latent plan in natural language, which is then executed by a separate model. We introduce a challenging real-time strategy game environment in which the actions of a large number of units must be coordinated across long time scales. We gather a dataset of 76 thousand pairs of instructions and executions from human play, and train instructor and executor models. Experiments show that models using natural language as a latent variable significantly outperform models that directly imitate human actions. The compositional structure of language proves crucial to its effectiveness for action representation. We also release our code, models and data.